Health Service Modelling Associates Programme
Don’t forget that you can book time to see me.
Slots are currently available up until the second week of June.
Book via this link
Pop me a message if you can’t make those times and we can work something out.
Notebook LM is a Google product that allows you to pass in specific sources that will be used by a generative AI model.
You can feed it a range of sources, including
(Thanks to Joel for originally making me aware of this tool)
Because it works only from the specific sources you give it, you can use it to remind yourself of the HSMA way of doing things.
Notebook LM has the really nice feature of referring back to the sources it has used.
In code, though, that means source citations appear in square brackets in a way that would break the code…
The ‘deep dive’ feature is a way to get a good summary of complex information like scientific papers.
It generates a podcast!
This repository contains a script to automatically generate documentation from a codebase.
https://github.com/The-Pocket/PocketFlow-Tutorial-Codebase-Knowledge
It’s designed to write beginner-friendly documentation with loads of analogies and clear breakdowns of what is going on.
Documentation is the thing no-one wants to do…
And it always goes out of date!
In addition, if you’re trying to work with someone else’s code, it can be really hard to get your head around how everything is done.
The framework
(I’ve then been converting the output into Quarto for easy integration with existing documentation)
It generates code examples…
And diagrams…
Because of the way this code is designed, you can modify the prompt that gets fed in.
It just lives in text in the nodes.py file.
So you can ask it to make the documentation more or less beginner friendly, or add very specific requests.
Anything you feed into this may be fair game for model training.
And you certainly don’t want to feed in anything that needs to remain private!
But if your repository is already open-source, then you may wish to consider it.
Google’s most advanced model (gemini-2.5-pro-exp-03-25) is currently free for roughly 25 requests a day.
For me, that was enough to generate two sets of documentation per day before hitting the limit.
I’ll put together a quick guide on the ‘how’ in the next few days, but if you’re keen to try it out sooner, have a go at following the instructions in the repository readme (and message me if you get stuck).
And have a read of a sample generated here: https://sammirosser.com/vidigi_autodoc_test/
There’s a new chapter in the DES book on getting distributions from real data!
This covers using the fitter package.
Let’s assume we start with a dataframe of historical activity times.
You pass in a list of values…
We can then apply the ‘get_best’ method to our object to return the best-fitting distribution and its parameters.
This can then be passed into your model.
There is also a new chapter on event logging.
We will be using the term ‘event logging’ to describe the process of generating a step-by-step log of what happens to each entity as they pass through our system.
The resulting file will be an ‘event log’.
…
We have several mandatory columns:
entity_id: a unique identifier to allow us to follow a given entity through their journey
event_type: this column is used to distinguish between three key kinds of events:
event: this column further breaks down what is happening during each event type, such as what stage of the system people are waiting to interact with
time: this can be an absolute timestamp in the form of a datetime (e.g. 2027-01-01 23:01:47), or a relative timestamp in time units from the start of the simulation.
run: in a multi-run Trial, which run the result came from
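Putting the columns above together, an event log might look something like this (a hedged sketch — the specific event_type and event names below are illustrative, not the book’s exact naming):

```python
import pandas as pd

# Illustrative event log rows for a single entity across one run.
# Column names follow the list above; the event names are made up.
event_log = pd.DataFrame([
    {"entity_id": 1, "event_type": "arrival_departure", "event": "arrival",            "time": 0.0, "run": 1},
    {"entity_id": 1, "event_type": "queue",             "event": "triage_wait_begins", "time": 0.0, "run": 1},
    {"entity_id": 1, "event_type": "resource_use",      "event": "triage_begins",      "time": 4.2, "run": 1},
    {"entity_id": 1, "event_type": "resource_use",      "event": "triage_ends",        "time": 9.7, "run": 1},
    {"entity_id": 1, "event_type": "arrival_departure", "event": "depart",             "time": 9.7, "run": 1},
])

print(event_log)
```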
Once you have this log, you can use it for a wide range of visuals
Note
Library of code snippets coming soon!
(fancy contributing?)
bupaR is an R library for process mining - but we won’t let that stop us!
With our event logs + some handy code snippets, generating some process mapping outputs becomes possible!
There are a few different ways we could get our Python event logs to work with bupaR:
the reticulate package (which runs Python from R) - though due to the complexity of our code, this is likely to run into issues
the rpy2 package (which runs R from Python) - as we only want a little bit of R in a primarily Python project, this might be a better option
Quarto’s features for passing objects like dataframes between R and Python cells
exporting our event log as a csv, importing this into R, and saving the resulting bupaR visuals
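The last option is the simplest: on the Python side it’s just a one-line export (a sketch — the dataframe here is a stand-in for your real event log):

```python
import pandas as pd

# Stand-in for the event log produced by your simulation
event_log = pd.DataFrame({
    "entity_id":  [1, 1],
    "event_type": ["queue", "resource_use"],
    "event":      ["triage_wait_begins", "triage_begins"],
    "time":       [0.0, 4.2],
    "run":        [1, 1],
})

# Export for reading into R (e.g. with readr::read_csv) before
# converting into a bupaR event log object
event_log.to_csv("event_log.csv", index=False)
```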
You can choose between mean stage time, max, min, etc.
Find some code to help you do this with your own simulation logs: https://des.hsma.co.uk/process_logs_with_bupar.html
Explore other visuals in their documentation: https://bupaverse.github.io/docs/visualize.html
Chapter coming soon…
Verification = did we build it right (to match our conceptual model, and without bugs)?
Validation = does it actually match the real world well enough to be useful?
Assumptions documented and justified
A lot more than before!
Custom fonts appear to be supported via downloaded font files
The sidebar theme is fully customizable too.
So how would we build up a new candidate combination?
LSOAs (or any other geography) have a concept of neighbours (something sharing a border)
Here, we start with a territory that’s pretty central in the existing territories
On each step, it
This randomness ensures we don’t end up with every player owning the same number of units of territory each time.
We can then generate multiple allocations, which will all (probably) differ.
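The growth idea can be sketched in a few lines (a toy illustration — the geography and neighbour lookup here are invented; the real code works from LSOA boundary data):

```python
import random

# Invented toy geography: each area and the areas sharing a border with it
neighbours = {
    "A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"],
    "D": ["B", "C", "E"], "E": ["D"],
}

def grow_territory(seed, n_areas, rng):
    """Grow a territory outwards from a seed area by repeatedly
    claiming a randomly chosen unclaimed neighbour."""
    territory = {seed}
    while len(territory) < n_areas:
        frontier = [n for area in territory for n in neighbours[area]
                    if n not in territory]
        if not frontier:
            break  # no unclaimed neighbours left to expand into
        territory.add(rng.choice(frontier))
    return territory

rng = random.Random(1)
print(grow_territory("A", 3, rng))
```

Because the neighbour picked at each step is random, repeated calls produce different (but always contiguous) territories.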
We can then score each solution on a metric - here, we’re trying to minimize the difference in received calls across the regions, so we work out the average across all regions, then sum each region’s absolute difference from that average.
The closer the resulting value is to zero, the better the solution is.
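Using invented call counts, the scoring described above might look like:

```python
# Invented example: calls received per region under one candidate solution
calls_per_region = [120, 95, 110, 75]

mean_calls = sum(calls_per_region) / len(calls_per_region)  # 100.0

# Score = total absolute difference from the average;
# closer to zero means a more balanced allocation
score = sum(abs(calls - mean_calls) for calls in calls_per_region)
print(score)
```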
Rather than just randomly generating thousands of candidate solutions, we may move to a better solution faster by varying the best random solutions.
With some more boundary logic, we can create new possible solutions that are a variation on our best solution from the previous step.
We then evaluate again.
And - in theory - we’d keep going for a set number of generations or a defined amount of compute time - and see how good a solution we can come up with!
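The keep-the-best-and-vary loop can be sketched as follows (a hedged illustration — the score and mutate functions here are stand-ins; the real code scores call balance across regions and varies solutions via boundary swaps):

```python
import random

rng = random.Random(0)

def score(solution):
    # Stand-in objective: closer to zero is better
    return abs(sum(solution))

def mutate(solution):
    # Stand-in variation: nudge one element up or down
    s = list(solution)
    i = rng.randrange(len(s))
    s[i] += rng.choice([-1, 1])
    return s

# Start from the best of a batch of random candidate solutions...
best = min(([rng.randint(-5, 5) for _ in range(4)] for _ in range(20)),
           key=score)

# ...then, for a set number of generations, vary the best and re-evaluate.
# Keeping the current best in the pool means the score never gets worse.
for _ in range(50):
    candidates = [mutate(best) for _ in range(10)] + [best]
    best = min(candidates, key=score)

print(score(best))
```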